Abstract

The All Colors Shortest Path problem, defined on an undirected graph, aims at finding a shortest, possibly non-simple, path
where every color occurs at least once, assuming that each vertex in the graph is associated with a color known in advance.
To the best of our knowledge, this paper is the first to define and investigate this problem. Even though the problem
is computationally similar to the generalized minimum spanning tree and the generalized traveling salesman problems,
allowing for non-simple paths, where a node may be visited multiple times, makes the All Colors Shortest Path problem novel
and computationally unique. In this paper we prove that the All Colors Shortest Path problem is NP-hard, and does not
lend itself to a constant factor approximation unless P = NP. We also propose several heuristic solutions for this problem based on
LP-relaxation, simulated annealing, ant colony optimization, and genetic algorithm, and provide extensive simulations
for a comparative analysis of them. The heuristics presented are not the standard implementations of the well known
heuristic algorithms, but rather sophisticated models tailored for the problem at hand. This is supported by the
very promising results reported.

Keywords: NP-hardness, inapproximability, LP-relaxation, heuristic algorithms, simulated annealing, ant colony 
optimization, genetic algorithm. 


1. Introduction 

Given an undirected edge weighted graph where each
vertex has an a priori assigned color, the All Colors Shortest
Path (ACSP) problem is defined as a generic problem in
which the aim is to find a shortest, possibly non-simple, path
that starts from a designated vertex, and visits every color
at least once. As the same node might need to be visited
multiple times, the path is not necessarily simple. This
makes ACSP a novel and unique problem that, to the best
of our knowledge, has never been studied before. As the
problem is generic enough, it can be applied to a broad
range of possible areas including mobile sensor roaming,
path planning, and item collection.

In this paper, we study the ACSP problem, prove that the
problem is NP-hard, and that a constant factor approximation
algorithm cannot exist unless P = NP. An ILP
formulation is developed for ACSP, and elaborate heuristic
solutions to this optimization problem are also provided.
These heuristics are based on LP-relaxation, simulated annealing,
ant colony optimization, and genetic algorithm.
An experimental study is carried out to compare them,
and the results are reported.

The remainder of the paper is organized as follows. In
Section 2 we discuss the related work, and position our
paper with respect to the state of the art. In Section 3 we
formally define the problem, and provide the intractability
proof along with an inapproximability result. Section 4
presents an ILP formulation for ACSP. In Section 5 we
discuss the heuristic solutions we propose. The experimental
results are then presented, and the paper is
concluded in the final section.

2. Related Work 

ACSP, defined and investigated in this paper, has
features that make it look similar to a variety of problems
studied extensively in the literature, each of which,
however, has one or more discrepancies making ACSP
computationally unique. Among these, the Generalized Minimum
Spanning Tree (GMST) problem introduced in [16]
is probably the most similar to ACSP. Given an undirected
graph partitioned into a number of disjoint clusters, the GMST
problem is defined to be the problem of finding the minimum
cost spanning tree with exactly one node from every
cluster. This problem has been shown to be NP-hard in
[16], and some inapproximability results are presented in
m. Integer Linear Programming (ILP) formulations for
this problem are presented in [3], [TH], and [15]. There
also exist formulations in [3] and
m for a variant of GMST where at least one, instead of exactly one, node from
each cluster is visited. We refer to the latter version as
$\ell$-GMST. Even though there are such formulations, ACSP
still differs in the shape of the solution. While ACSP outputs
a possibly non-simple path, $\ell$-GMST returns a tree.
Moreover, it can be easily noted that a minimum spanning
tree returned by $\ell$-GMST can only give a rough estimate
for the size of a possibly non-simple shortest path visiting
all the colors, even when ACSP is required to return to the
base it starts off from, as shown in Figure 1. When the nodes
with the same color are perceived as disjoint clusters so as
to interpret this figure as an instance of $\ell$-GMST, the tree
spanning nodes 1 through 6 is the optimal solution to it
with cost 5.



Figure 1: An example graph corresponding to an instance of ACSP.
All the edges have a weight of 1, and the colors assigned to the nodes
are shown next to them. Node 1 is designated as the base. The
shortest path for this instance of ACSP is 1, 2, 7, 8, 9, 10, 11, which
has a length of 6. When the path is constrained to return to the
base, however, the path length of the solution becomes 8.

Another problem seemingly similar to ACSP is the Generalized
Traveling Salesman Problem (GTSP) formulated
first in [12]. Given a group of possibly intersecting clusters
of nodes, GTSP tries to find a shortest Hamiltonian tour
with at least one (or exactly one) visit to a node from every
cluster. An integer linear programming formulation for
GTSP when the distance matrix is asymmetrical is given
in [13]. In [14], it is shown that a given instance of GTSP
can be transformed into an instance of standard TSP. In
[5], GTSP is noted to be NP-hard as standard TSP is a
specialization of GTSP with clusters in the form of singleton
nodes. It is also surprising to note, as [1] demonstrates,
that GTSP can be transformed into standard TSP very
efficiently with the same number of nodes, but with a modified
distance matrix. ACSP also differs from these variants
of GTSP in that the nodes may be visited multiple
times, and the path returned need not be a cycle.

3. The Problem Definition 

ACSP is modeled as a graph problem. The input to
the problem is an undirected edge weighted graph where
each vertex is assigned a color known in advance. The goal
is then to find the shortest possibly non-simple path that
visits every distinct color at least once in this graph. The
formal definition of the problem is given as:

Definition 3.1. Given an undirected graph $G(V,E)$ with
a color drawn from a set $C$ of colors assigned to each
node, and a non-negative weight associated with each edge,
ACSP is the problem of finding the shortest (possibly non-simple)
path starting from a designated base node $s \in V$
such that every color occurs at least once on the path.

The weights $w_{i,j}$ where $(i,j) \in E$ in $G$ correspond to
distances. We will use the words weight, cost, and distance
interchangeably throughout the paper. The cost of
a solution to an instance of ACSP is simply the length of
the path returned.
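To make the definition concrete, the following is a minimal sketch of an exact solver for very small instances. It is not one of the methods studied in this paper: it simply runs Dijkstra's algorithm over states of the form (current vertex, set of colors seen so far), so its running time grows exponentially with the number of colors. The function name `acsp_exact` and the input layout (`adj`, `color`) are illustrative assumptions.

```python
import heapq

def acsp_exact(adj, color, base):
    """Length of a shortest (possibly non-simple) walk from `base` that
    sees every color at least once, or None if no such walk exists.

    adj   : dict vertex -> list of (neighbor, weight), undirected graph
    color : dict vertex -> color label
    Vertices are assumed to be mutually comparable (e.g. integers).
    """
    # Assign one bit per distinct color (assumed encoding, not from the paper).
    bit = {c: 1 << i for i, c in enumerate(set(color.values()))}
    full = (1 << len(bit)) - 1
    start_mask = bit[color[base]]
    dist = {(base, start_mask): 0}
    heap = [(0, base, start_mask)]
    while heap:
        d, u, mask = heapq.heappop(heap)
        if mask == full:
            return d  # first time all colors are covered is optimal
        if d > dist.get((u, mask), float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            new_mask = mask | bit[color[v]]
            nd = d + w
            if nd < dist.get((v, new_mask), float("inf")):
                dist[(v, new_mask)] = nd
                heapq.heappush(heap, (nd, v, new_mask))
    return None
```

On a star with center 0 as the base and two unit-weight leaves of different colors, the shortest walk is 0, 1, 0, 2, which revisits the center and illustrates why the path need not be simple.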

ACSP can easily be shown to be NP-hard by a trivial
polynomial time reduction from the Hamiltonian Path (HP)
problem, which is well-known to be NP-complete [7]. Given
an undirected graph $G(V,E)$, HP is defined to be the problem
of deciding whether it has a Hamiltonian path, namely,
a simple path that visits every node in the graph exactly
once.

3.1. NP-hardness of ACSP 

Given an instance of HP, it can be transformed to the
corresponding instance of ACSP as follows: Let the graph
in the given HP instance be denoted by $G(V,E)$. A new
graph $G'(V \cup \{s\}, E \cup \{(s,v) \mid v \in V\})$ is obtained by
adding to $G$ a new node $s$, and also the edges from $s$
to all the original nodes in $G$. Next, a distinct color from
$C = \{c_1, c_2, \ldots, c_{|V|+1}\}$ is assigned to each and every node
in $G'$. The weights associated with all the edges in $G'$ are
finally set to one. We can now state the following lemma.
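The transformation can be sketched in a few lines. This is only an illustration under the assumption that the HP vertices are numbered 0 to n-1, so that the new base can be labeled n; the function name `hp_to_acsp` and the edge encoding are hypothetical.

```python
def hp_to_acsp(n, edges):
    """Build the ACSP instance for an HP instance on vertices 0..n-1.

    Returns (weights, color, base): `weights` maps each undirected edge,
    stored as a frozenset of its endpoints, to its weight.
    """
    base = n  # label chosen for the new node s (an assumption)
    weights = {frozenset(e): 1 for e in edges}      # original edges, weight 1
    for v in range(n):
        weights[frozenset((base, v))] = 1           # edges from s to every node
    color = {v: v for v in range(n + 1)}            # |V| + 1 distinct colors
    return weights, color, base
```

For a graph with |V| vertices and |E| edges, the resulting instance has |V| + 1 vertices, |E| + |V| edges, and |V| + 1 colors, so the construction is clearly polynomial.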

Lemma 3.2. A given instance of HP represented with
$G(V,E)$ has a solution if and only if the corresponding
instance of ACSP obtained through the transformation
just described has a solution with length $|V|$.

Proof. Let us first prove the only if part. When the given
instance of HP has a solution, there must exist a Hamiltonian
path $P$ in $G$ given by $v_{\pi(1)} v_{\pi(2)} \ldots v_{\pi(i)} v_{\pi(i+1)} \ldots v_{\pi(|V|)}$
of length $|V| - 1$. As $P$ is a Hamiltonian path, the permutation
$\pi$ of nodes in $V$ is such that the edges $(v_{\pi(i)}, v_{\pi(i+1)}) \in E$
for all $i \in \{1..|V| - 1\}$. If we let $C$, and $G'(V',E')$ denote
the set of $|V| + 1$ colors, and the transformed graph
respectively in the corresponding instance of ACSP, it is
then possible to construct the path $P' = sP$ in $G'$ with
total path length $|V|$ where $s \in V'$ is designated as the
base node. This is the shortest path visiting
all distinct colors at least once, since the $|V| + 1$ colors sit on
$|V| + 1$ distinct nodes, and any path through them needs at
least $|V|$ edges of weight one.

In order to prove the if part, let us assume that we have
a shortest path of length $|V|$ that starts with node $s$ in the
corresponding instance of ACSP. Since the total number
of colors that needs to be visited is $|V| + 1$, each distinct
color, and hence, the corresponding node occurs exactly
once on this path. The removal of node $s$ readily specifies
a Hamiltonian path in $G$ of the given HP instance. □
The following theorem can hence be stated now.

Theorem 3.3. ACSP is NP-hard.

Proof. It is a direct consequence of Lemma 3.2. □

Having established the NP-hardness of ACSP, a possible
next step is to explore its approximability. With this
objective in mind, our attention was drawn to the $\ell$-GMST
problem having a similar computational structure. While
$\ell$-GMST looks for the minimum cost spanning tree, ACSP
seeks out a possibly non-simple path with at least one node
from every cluster provided that the nodes with the same
color are interpreted as disjoint clusters. The following
observation is first made to associate the optimal values of
the respective solutions attained by both problems when
fed with the same input. It is then used to report a result
regarding the approximability of ACSP.


Proposition 3.4. Let $I$ correspond to an input identified
by an undirected edge weighted graph $G(V,E)$, and a function
$\kappa : V \to \{1,2,\ldots,k\}$ mapping the vertices to colors.
For $1 \le i \le k$, $V_i = \{v \in V \mid \kappa(v) = i\}$ clearly induce a
set of $k$ disjoint clusters, which in turn allows for a proper
interpretation of $I$ by $\ell$-GMST. Then,

$$\mathrm{opt}_{\ell\text{-GMST}}(I) \le \min_{j \in V}\{\mathrm{opt}_{ACSP}(I_j)\} < 2 * \mathrm{opt}_{\ell\text{-GMST}}(I)$$

holds for all valid instances $I$, where $I_j$ is obtained from $I$
by designating $j \in V$ as the base node, and $\mathrm{opt}_A$ returns
the cost of the optimal solution to its argument interpreted
as an instance of either one of the two problems as dictated
by the subscript $A$.

Proof. Let us assume that the first inequality in the proposition
does not hold, and there is an instance $I$ for which
$\mathrm{opt}_{\ell\text{-GMST}}(I) > \min_{j \in V}\{\mathrm{opt}_{ACSP}(I_j)\}$. Let us suppose
that $s \in V$ is a node that minimizes the right-hand side of
this inequality. In that case, the solution to ACSP with
cost $\mathrm{opt}_{ACSP}(I_s)$ can be easily reworked, by simply eliminating
any cycles, and duplicate edges, into a tree $T'$. $T'$
is clearly a solution to $\ell$-GMST for the instance $I$ with
cost less than $\mathrm{opt}_{\ell\text{-GMST}}(I)$, hence contradicting the
assumption.

Let us assume now the latter inequality does not hold.
This, for at least one instance of input $I$, leaves us with
$\min_{j \in V}\{\mathrm{opt}_{ACSP}(I_j)\} \ge 2 * \mathrm{opt}_{\ell\text{-GMST}}(I)$. Let us also
assume that the tree $T'(V',E')$ is a solution to $\ell$-GMST
with cost $\mathrm{opt}_{\ell\text{-GMST}}(I)$ for instance $I$. Rooting $T'$ at some
$s \in V'$, a possibly non-simple path starting from $s$ could
be constructed visiting all the nodes in it by a depth-first
search. This path, however, forms a solution to ACSP for
instance $I_s$ with cost strictly less than $2 * \mathrm{opt}_{\ell\text{-GMST}}(I)$ as
no edge gets visited more than twice, and there exists at
least one edge that is visited exactly once given that the
return to the base is not performed upon hitting the last
leaf node in $V'$. This, however, contradicts the assumption. □

It is shown in [10] that $\ell$-GMST, referred to as the CLASS
TREE problem in the paper, does not have a constant-factor
polynomial time approximation algorithm (apx) unless
P = NP.

Theorem 3.5. ACSP does not have a constant-factor
polynomial time approximation algorithm unless P = NP.

Proof. Let us assume, to the contrary, that ACSP has an
apx denoted by $\mathrm{apx}_{ACSP}$. Based on this assumption, an
apx for $\ell$-GMST can be shown to also exist, and hence a
contradiction, as follows.

Given any valid input $I$ for $\ell$-GMST, consisting of an
undirected graph $G(V,E)$ along with disjoint clusters $V_i \subset V$
with $1 \le i \le k$, we denote by $I_j$ the input for ACSP
obtained from $I$ by designating $j \in V$ as the base. The
initial assumption with regard to the existence of an apx
suggests by definition

$$\mathrm{opt}_{ACSP}(I_j) \le \mathrm{apx}_{ACSP}(I_j) \le c * \mathrm{opt}_{ACSP}(I_j)$$

for some constant $c \ge 1$, and all valid input $I_j$ where $j \in V$.
Taking the minimum over all $j \in V$, we obtain

$$\min_{j \in V}\{\mathrm{opt}_{ACSP}(I_j)\} \le \min_{j \in V}\{\mathrm{apx}_{ACSP}(I_j)\} \le c * \min_{j \in V}\{\mathrm{opt}_{ACSP}(I_j)\}.$$

Combining this result with Proposition 3.4,

$$\mathrm{opt}_{\ell\text{-GMST}}(I) \le \min_{j \in V}\{\mathrm{apx}_{ACSP}(I_j)\} < 2 * c * \mathrm{opt}_{\ell\text{-GMST}}(I)$$

is readily obtained. It should be noted that the minimization
over $\mathrm{apx}_{ACSP}(I_j)$ involves running the constant-factor
approximation for ACSP separately for each $j \in V$,
and the total time, even though amplified by a factor of
$|V|$, is still polynomial in the size of a given instance.
Therefore, the last inequality implies, by definition, a $2c$-factor
apx for $\ell$-GMST. This, however, is a contradiction,
and hence, the proof. □
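The depth-first argument in the proof of Proposition 3.4 can be illustrated with a short sketch: walk a tree in depth-first order from the root, and stop at the last newly discovered vertex instead of returning to the root. The function names and the adjacency-list layout below are assumptions made for illustration only.

```python
def dfs_walk(tree, root):
    """Depth-first walk over a tree that visits every vertex and stops at
    the last vertex discovered (no final return to the root).

    tree : dict vertex -> list of (neighbor, weight)
    """
    walk = [root]
    last_new = 0  # index in `walk` of the most recent first visit

    def visit(u, parent):
        nonlocal last_new
        for v, _ in tree[u]:
            if v != parent:
                walk.append(v)          # go down the edge (u, v)
                last_new = len(walk) - 1
                visit(v, u)
                walk.append(u)          # come back up the same edge
    visit(root, None)
    return walk[: last_new + 1]         # drop the trailing return trip

def walk_cost(tree, walk):
    """Total weight of the consecutive edges along `walk`."""
    w = {(u, v): c for u in tree for v, c in tree[u]}
    return sum(w[a, b] for a, b in zip(walk, walk[1:]))
```

Every tree edge is traversed at most twice, and the edges on the way to the final vertex are traversed only once, which is why the walk costs strictly less than twice the tree whenever the tree has positive total weight.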

4. ILP Formulation of ACSP 

In this section, an Integer Linear Programming formulation
of ACSP is presented. To this end, we start by
making the following observation.

Proposition 4.1. In an optimal solution to any instance 
of ACSP, no edge can be visited more than once in any 
given direction. 

Proof. We assume that $p$ is a possibly non-simple path
with the shortest distance, forming a solution to a given
instance of ACSP. Contrary to the proposition, we proceed
by assuming that an edge $(i,j)$ is traversed more than once
in the direction from node $i$ to node $j$. Highlighting the
first two occurrences of this edge, the path can then be
represented as $p = s, x, i, j, y, i, j, z$ where $s$ is the base, and
$x$, $y$, and $z$ are sequences of zero or more nodes with edges
in between consecutive nodes. It should be noted that
neither $x$ nor $y$ is allowed, by the assumption, to have
any occurrences of $i$, and $j$ consecutively in this order. We
can, in that case, construct a new path $p' = s, x, i, y^R, j, z$
with $y^R$ corresponding, in reverse order, to the sequence
of nodes in $y$. This new path, $p'$, visits the same set of
nodes as $p$, but is shorter by $2 * w_{i,j}$ than $p$. This
contradicts the optimality of $p$, hence proving the
proposition. □
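The rewiring step used in the proof of Proposition 4.1 can be sketched as a small list manipulation. The helper name `shortcut` is hypothetical; it only illustrates how the segment between the first two traversals of an edge from i to j is reversed.

```python
def shortcut(path, i, j):
    """If `path` traverses the edge i -> j at least twice, return the path
    p' = s, x, i, y^R, j, z from the proof; otherwise return `path` as is."""
    hits = [t for t in range(len(path) - 1)
            if path[t] == i and path[t + 1] == j]
    if len(hits) < 2:
        return path
    a, b = hits[0], hits[1]      # positions of the first two i's followed by j
    y = path[a + 2 : b]          # nodes strictly between the two traversals
    return path[: a + 1] + y[::-1] + path[b + 1 :]
```

The returned path visits the same set of nodes but contains two fewer traversals of the edge between i and j.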

Proposition 4.1 allows for an ILP formulation of ACSP
in which a binary decision variable suffices to track whether
an edge is visited, as part of an optimal solution, in either
one of the two possible directions.
This observation, coupled with the motivation to
come up with a compact ILP model, forms the basis of the
transformation to be described next. At the heart of the
transformation is the replacement of each undirected edge
in a given instance of ACSP with two directed edges, and
hence the adoption of a directed graph view as a substitute
in the ILP formulation.

Let us assume that an instance of ACSP is given, as
determined by an undirected edge weighted graph $G(V,E)$,
the designated base vertex $s \in V$, and $\kappa : V \to C$ mapping
the vertices $V = \{1,\ldots,n\}$ to colors $C = \{1,\ldots,k\}$. Finally,
the weights associated with the edges in $G$ are denoted
by $w_{i,j}$ for all unordered pairs $(i,j)$ (or $(j,i)$) $\in E$. It is
therefore implicitly assumed that $w_{i,j} = w_{j,i}$ for all $(i,j) \in E$.

In transforming $G(V,E)$ to a directed graph $G'(V',E')$
to be used in the ILP formulation, we first introduce two
new nodes numbered $0$ as the source, and $n+1$ as the sink,
setting effectively $V' = V \cup \{0, n+1\}$ in $G'$. Besides, the
source, and the sink are both assigned to a new color $0$,
extending the color set to $C' = C \cup \{0\}$. With the addition
of the new color, $\kappa$ is also augmented accordingly with
$\kappa(0) = \kappa(n+1) = 0$. Then, a directed edge $(0,s)$ from the
new source to the base $s$, as well as directed edges $(i,n+1)$
to the sink, for all $i \in V$ in $G$, are added into $G'$ with their
weights set to $0$. Lastly, each undirected edge $(i,j) \in E$
is replaced by two directed edges $(i,j)$ and $(j,i)$ in $G'$,
both of whose weights are initialized to the weight of the
original edge. With this final step, the transformation sets
$E' = \{(i,j), (j,i) \mid (i,j) \in E\} \cup \{(0,s)\} \cup \{(i,n+1) \mid i \in V\}$
in $G'$. Continuing to use the same notation for weights in
$G'$, $w_{0,s} = 0$, and $w_{i,n+1} = 0$ for all $i \in V$ are added to
the existing $w_{i,j} = w_{j,i}$ for all unordered pairs $(i,j) \in E$.

Any possibly non-simple path, $p$, starting from the designated
base $s$, and visiting all colors at least once in $G$,
corresponds precisely to the path $0, p, n+1$ in $G'$, where
nodes $0$, and $n+1$ are the source, and the sink respectively.
In the same way, a possibly non-simple path $p = 0, p', n+1$
in $G'$, where $p'$ is a possibly non-simple path starting at
$s$, and with length at least one, corresponds to $p'$ in $G$.
As a result, the feasible solutions in $G$, and $G'$ will be in
one-to-one correspondence, as long as the ILP formulation
of ACSP can place a restriction on any feasible solution
in $G'$ to start from the source, and to terminate at the
sink. Moreover, these corresponding solutions both have
the same cost. It is hence obvious that a solution to an instance
of ACSP on $G$ as given above is optimal if and only
if the corresponding solution on the transformed instance
employing $G'$ is also optimal.
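The construction of the directed graph can be sketched as follows, assuming the vertices are numbered 1 to n as in the text. The function name `to_directed` and the dictionary-based encoding of edges and colors are illustrative assumptions.

```python
def to_directed(n, edges, kappa, s):
    """Build the arc weights and extended coloring of G' from G.

    n     : number of vertices, labeled 1..n
    edges : dict {(i, j): w} with one entry per undirected edge
    kappa : dict vertex -> color in 1..k
    s     : designated base vertex
    """
    source, sink = 0, n + 1
    arcs = {}
    for (i, j), w in edges.items():
        arcs[(i, j)] = w          # each undirected edge becomes
        arcs[(j, i)] = w          # two directed edges of equal weight
    arcs[(source, s)] = 0         # zero-weight arc from the source to the base
    for i in range(1, n + 1):
        arcs[(i, sink)] = 0       # zero-weight arc from every vertex to the sink
    kappa_ext = dict(kappa)
    kappa_ext[source] = 0         # new color 0 for the source and the sink
    kappa_ext[sink] = 0
    return arcs, kappa_ext
```

The result has 2|E| + 1 + n arcs, and none of the added arcs contributes to the cost of a path.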

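The ILP formulation for a given instance of ACSP can
now be stated with reference to the transformation described
above.

minimize   $\sum_{(i,j) \in E'} x_{i,j} * w_{i,j}$   (1)

subject to

$x_{0,s} = 1$   (2)

$\sum_{(i,j) \in E' : \kappa(j) = c} x_{i,j} \ge 1$,   $\forall c \in C'$   (3)

$\sum_{j:(j,i) \in E'} x_{j,i} = \sum_{j:(i,j) \in E'} x_{i,j}$,   $\forall i \in V$   (4)

$y_j \ge x_{i,j}$,   $\forall (i,j) \in E'$   (5)

$\sum_{i:(i,j) \in E'} x_{i,j} \ge y_j$,   $\forall j \in V' \setminus \{0\}$   (6)

$\sum_{j:(j,i) \in E'} f_{j,i} = y_i + \sum_{j:(i,j) \in E'} f_{i,j}$,   $\forall i \in V$   (7)

$x_{i,j} \le f_{i,j} \le (n+1) * x_{i,j}$,   $\forall (i,j) \in E'$   (8)

$x_{i,j} \in \{0,1\}$,   $\forall (i,j) \in E'$   (9)

$y_i \in \{0,1\}$,   $\forall i \in V' \setminus \{0\}$   (10)

$f_{i,j} \in \{0,1,\ldots,n+1\}$,   $\forall (i,j) \in E'$   (11)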

The objective in this formulation is to minimize the sum
of the weights over all the directed edges that have been
visited as shown in (1). The binary variable $x_{i,j}$ is set to 1
when the directed edge $(i,j)$ is visited, and to 0 otherwise.
It should be noted that all the edges involving the source,
and the sink, introduced later in the transformation, with
weight zero have no effect on the objective. Constraint (2)
ensures that the edge from the source to the base is always
a part of any feasible solution. Therefore, any feasible path
always starts from the source, and then moves straight to
the base. Constraint (3) demands for each distinct color
that the number of the visited edges directed at the nodes
with this same color is at least one. As a result, every
distinct color gets visited at least once. As the constraint
must also hold for color 0, any feasible path is guaranteed
to terminate at the sink. Constraint (4) is used to make
sure that the number of the visited edges that enter into
any node $i$ in $G$ is equal to the number of the visited edges
that leave it. This holds for all the nodes
but the source, and the sink in $G'$. The main ingredient in
enforcing the shape of the solution to a possibly non-simple
path is this constraint. Constraints (5), and (6) establish
collectively the rules associated with the variables $y_j$ for
all $j \in V' \setminus \{0\}$. The binary decision variable $y_j$ is set to 1
if and only if node $j$ has been visited in a feasible solution.
Constraint (5) asserts that visiting an edge $(i,j)$
implies visiting node $j$, while Constraint (6)
states the converse, namely that a visited node must have
at least one visited incoming edge. Constraint (7), along with (8),
is used to eliminate any possible sub-tours, and to ensure
connectedness to the base. Constraint (7) employs non-negative
integer valued flow variables, denoted by $f_{i,j}$ for
all edges $(i,j) \in E'$. It enforces the total flow into a visited
node to be equal to one greater than the total flow
out of that node. In formulating this constraint, it is assumed
that the source supplies a limited amount of flow
to distribute to those nodes that are visited in any feasible
solution. Hence, each node visited consumes a unit flow.
Constraint (8) is in charge of regulating the flow values. A
flow is associated with an edge if and only if that edge is
part of a solution. As the flow is conserved at all the nodes
in the original graph, the base node $s$ is no exception. Coupled
with the fact that each node visited consumes a unit
flow, the edge $(0,s)$ should carry as many unit flows as
there are nodes to visit. Excluding the source leaves us
with a maximum of $n+1$ nodes, and hence, the factor in
(8). Finally, the constraints (9), (10), and (11) are the
integrality constraints for the decision variables $x_{i,j}$, $y_i$, and
$f_{i,j}$ respectively.


5. Heuristic solutions 


In this section, we describe our heuristic solutions to
the intractable ACSP problem. Section 5.1 explains several
heuristic solutions based on LP-relaxation. Simulated
annealing, ant colony optimization, and genetic algorithm
based heuristic solutions to ACSP are presented later in
Sections 5.2, 5.3, and 5.4 respectively.


5.1. LP Relaxation 

The given ILP formulation, (1) through (11), is relaxed
to an LP by replacing the integrality constraints (9), (10),
and (11) with

$0 \le x_{i,j} \le 1$,   $\forall (i,j) \in E'$

$0 \le y_i \le 1$,   $\forall i \in V' \setminus \{0\}$

$0 \le f_{i,j} \le n+1$,   $\forall (i,j) \in E'$

respectively. Now, the decision variables can take on real
values.

We propose several heuristics based on rounding the solutions
to this LP relaxation. Having learned from Theorem 3.5
that a constant-factor approximation does not
exist for ACSP unless P = NP, we explore strategies based on iterative
rounding, rather than typical one-shot rounding.

The first heuristic, called LP$_x$ACSP, after obtaining the
optimal solution to the LP relaxation, finds the maximum
value strictly less than one among all $x_{i,j}$ with $(i,j) \in E'$. The set of
all indexes for which this maximum is attained is denoted
by $\mu$ (i.e., $\mu = \operatorname{argmax}_{((i,j) \in E') \wedge (0 < x_{i,j} < 1)} x_{i,j}$). Next, the LP
relaxation formulation at hand is augmented with the additional
constraints in the form of $x_{i,j} = 1$ for all $(i,j) \in \mu$.
Finally, a subsequent call to LP for the extended formulation
is issued. Hence, the job, in this subsequent call, becomes
finding the shortest possibly non-simple path that
fulfills not only the previous set of constraints but also
passes through every additional edge explicitly dictated
by the added constraints. This process is repeated until
no fractional values to process are left, and hence $\mu = \emptyset$.

The other two heuristics, called LP$_f$ACSP, and
LP$_{f/x}$ACSP, use exactly the same strategy described
above except for how $\mu$ is computed before a call
to LP. While LP$_f$ACSP relies on the flow variables
($\mu = \operatorname{argmax}_{((i,j) \in E') \wedge (0 < x_{i,j} < 1)} f_{i,j}$) in deciding which additional
$x_{i,j}$ values to round before the next iteration,
LP$_{f/x}$ACSP bases its decision on the ratios $f_{i,j}/x_{i,j}$
($\mu = \operatorname{argmax}_{((i,j) \in E') \wedge (0 < x_{i,j} < 1)} f_{i,j}/x_{i,j}$).
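The selection step of the first heuristic can be sketched as below. The sketch only covers how the set of edges to be fixed is picked from a fractional LP solution; solving the LP itself is left out. The function name and the numerical tolerance `eps` are assumptions, since an LP solver returns floating point values and the text does not say how ties are detected.

```python
def argmax_fractional(x, eps=1e-9):
    """Return the set of edges whose x value is the largest among the
    strictly fractional ones (0 < x < 1); empty if x is integral.

    x : dict {(i, j): value} taken from an LP relaxation solution
    """
    fractional = {e: v for e, v in x.items() if eps < v < 1 - eps}
    if not fractional:
        return set()              # nothing left to round
    best = max(fractional.values())
    return {e for e, v in fractional.items() if abs(v - best) <= eps}
```

Each edge in the returned set would then be fixed to one before the LP is solved again, and the loop ends once the returned set is empty. The other two heuristics differ only in the quantity that is maximized.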


5.2. Simulated Annealing 

We develop another heuristic solution for ACSP, based
on Simulated Annealing (SA) [11]. This new heuristic,
called SA-ACSP, can be described in three primary parts:

(1) Choosing an initial random path: The general
outline of the algorithm for SA-ACSP is given in Figure 2.
The algorithm starts with a random possibly non-simple
path that visits every color at least once. Such a path
is constructed by randomly extending an existing path,
originating from the base, until it visits every color at least
once. This process is performed only once at the
start of each iteration of the while loop in the algorithm.
(2) Generating neighbors: We generate a neighbor 
by removing the last node in the current state, and then, 
adding to a random position in the path the closest node 
with the same color as the removed node. 

(3) Selecting the best path: Starting from an initial
temperature, denoted by $T$ in the algorithm in Figure 2,
the system is cooled down until a frozen state is reached,
where the temperature is close to zero. Cooling down is
done by decreasing the temperature slightly at each iteration,
as seen in the last statement of the while loop. The symbol $R$ there corresponds
to the cooling rate. The energy of each state is defined by
the cost of the path selected at that stage. As the temperature
approaches that of a frozen state, SA-ACSP
keeps exploring the neighbors. At each iteration, a neighbor
solution is discovered in the search space, and chosen
probabilistically according to Metropolis Criterion m,

$$p(\Delta E) = e^{-\Delta E / (kT)},$$

where $k$ is Boltzmann's constant, $T$ is the temperature,
and $\Delta E$ is the difference between the energies of the current,
and the neighbor solutions. If the total energy decreases,
the new state is assumed right away. Otherwise,
the system chooses to go to the new state according to the
probability that is produced by Metropolis criterion.
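The acceptance rule described above can be sketched as a single function. The function name, the default value of `k`, and the optional `rng` argument are assumptions made for the sake of a self-contained example; in practice the constant is usually folded into the temperature scale.

```python
import math
import random

def accept(delta_e, temperature, k=1.0, rng=random):
    """Metropolis acceptance test for a move with energy change delta_e."""
    if delta_e < 0:
        return True                                   # improving moves are always taken
    threshold = math.exp(-delta_e / (k * temperature))
    return rng.random() < threshold                   # worsening moves are taken with this probability
```

At high temperatures the threshold is close to one, so most worsening moves are accepted; as the temperature approaches the frozen state the threshold tends to zero and the search becomes greedy.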
procedure SA-ACSP
    bestPath ← ∅;
    bestCost ← ∞;
    T ← SetInitialTemperature();
    iterationCount ← noOfNodes * noOfColors / 5;
    while T > freezingTemp do
        localBestPath ← findARandomPath();
        localBestCost ← costOf(localBestPath);
        for i ← 0 to iterationCount do
            nextPath ← findANeighbourPath();
            nextCost ← costOf(nextPath);
            ΔE ← nextCost − localBestCost;
            if ΔE < 0 then
                localBestPath ← nextPath;
                localBestCost ← nextCost;
            else
                r ← Random(0, 1);
                if r < e^(−ΔE/(kT)) then
                    localBestPath ← nextPath;
                    localBestCost ← nextCost;
                end if
            end if
        end for
        if localBestCost < bestCost then
            bestPath ← localBestPath;
            bestCost ← localBestCost;
        end if
        T ← T * R;
    end while
    return bestCost;
end procedure


Figure 2: SA-ACSP: Heuristic based on simulated annealing.


5.3. Ant Colony Optimization

In this section, we present the details of how Ant Colony
Optimization (ACO) [2], [3] is applied to obtain ACO-ACSP,
another heuristic solution to the ACSP problem.

In ACSP, each color needs to be visited at least once.
Therefore, the ant colony optimization algorithm is implemented
to visit multiple food types, where each food type
corresponds to a distinct color. In other words, when an
ant leaves the nest, its search is not over until it finds a
path that passes over every food type. The base node is
chosen as the nest of all the ants, and at each iteration,
the entire colony of ants is released from this nest to the
graph.

The random movements of the ants, while visiting a
node, are governed by an edge selection procedure. Depending
on whether there is any trace of pheromone on an
incident edge, the ants compute two types of edge selection
probabilities. In the first one, when there is no pheromone
on any incident edge, the ants make the selection based on
the edge costs (or distances) using the following formula:

$$prob_{i,j} = \frac{C_0 - w_{i,j}}{\sum_{k:(i,k) \in E} (C_0 - w_{i,k})},$$

where $prob_{i,j}$ is the selection probability of edge $(i,j) \in E$,
and $C_0$ is a constant. The second case occurs when the
pheromone level on at least one incident edge is not zero.
In this case, the edge selection probability calculation is
performed, based on the pheromone levels, as:

$$prob_{i,j} = \frac{[D_{i,j}]^{\alpha} * [\sum_{k \in C} Ph_{i,j}(k)]^{\beta}}{\sum_{l:(i,l) \in E} [D_{i,l}]^{\alpha} * [\sum_{k \in C} Ph_{i,l}(k)]^{\beta}},$$

where $Ph_{i,j}(k)$ is the pheromone level on edge $(i,j) \in E$
associated with color $k \in C$, $\alpha$, and $\beta$ are user defined
parameters with $0 < \alpha < \beta < 1$, and the desirability $D_{i,j}$
of edge $(i,j) \in E$ is defined to be inversely proportional to
the edge's cost as $D_{i,j} = 1/w_{i,j}$.

Pheromone level updates are carried out in two different
ways, namely, the local, and the global updates. The local
updates are applied to all the edges selected because each
ant secretes pheromone as it moves on the edges. Moreover,
the pheromones are not stored only on the edges.
Ants also have some pheromones within themselves, and
their levels drop while they are secreted by the ants during
their traversal. Therefore, the local updates are performed
on the edges as well as the ants selecting them. While the
local update to the pheromone level corresponding to color
$k \in C$, after the selection of edge $(i,j) \in E$ by ant $t$, is
performed by

$$Ph_{i,j}(k) = (1 - \delta) * Ph_{i,j}(k) + \delta * Ph_t(k),$$

the level of the pheromone associated with color $k$ stored
on ant $t$ becomes the subject of the local update

$$Ph_t(k) = Ph_t(k) - \delta * Ph_t(k),$$

where $\delta$ is a user defined evaporation parameter such that
$0 < \delta < 1$. It should be noted here that the same notation
has been employed to keep track of the pheromone levels
on both the edges, and the ants. However, the levels of
the pheromones stored on ants are tracked with a single
subscripted index as opposed to two for the edges.

The second type of the update to pheromone levels
comes under the title of the global update. The global
pheromone update is also known as off-line pheromone
update. It is applied, at the end of each iteration, only
to the edges that are on the best path found so far. The
pheromone level for each color $k \in C$ on each such edge is
updated using the following formula

$$Ph_{i,j}(k) = (1 - \delta) * Ph_{i,j}(k) + \delta * \frac{1}{cost(bestPath)},$$

where $cost(bestPath)$ denotes the total cost of traversing
the edges associated with the best, possibly non-simple,
path found so far.

The pseudo-code of the ACO-ACSP heuristic algorithm,
reflecting the general anatomy of ant colony optimization as
applied to ACSP, is presented in Figure 3.
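The local and global pheromone updates described in this section can be sketched for a single edge and a single color as follows. The function names are hypothetical, and the sketch assumes that both local updates are computed from the levels held before the move.

```python
def local_update(ph_edge, ph_ant, delta):
    """Local update after an ant selects an edge (one color).

    Returns the new (edge level, ant level) pair.
    """
    new_edge = (1 - delta) * ph_edge + delta * ph_ant   # ant deposits on the edge
    new_ant = ph_ant - delta * ph_ant                   # ant's own reserve drops
    return new_edge, new_ant

def global_update(ph_edge, delta, best_cost):
    """Global (off-line) update for an edge on the best path found so far."""
    return (1 - delta) * ph_edge + delta * (1.0 / best_cost)
```

With an evaporation parameter of 0.5, an ant carrying one unit of pheromone over a fresh edge leaves half a unit on the edge and keeps half a unit, while the global update rewards edges on shorter best paths more strongly.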


5.4. Genetic Algorithm

The Genetic Algorithm (GA) [3] has five main steps: 
initialization, fitness, selection, crossover, and mutation. 
procedure selectEdge(Ant ant)
    if ant.isDone OR ant.isDiscarded then
        ant.pheromoneUpdate ← false;
    else
        incidentEdges ← findAvailableEdges(ant);
        if sizeOf(incidentEdges) = 0 then
            ant.isDiscarded ← true;
        else
            ant.pheromoneUpdate ← true;
            sum ← calcProbUsingPheromones();
            if sum = 0 then
                calcProbUsingDistances();
            end if
            updateAntToPickEdge();
        end if
    end if
end procedure

procedure ACO-ACSP()
    bestAnt ← ∅;
    for j ← 0 to iterationCount do
        colony ← createAnts(colonySize);
        antsDoneTour ← 0;
        while antsDoneTour < colonySize do
            for all ant in colony do
                selectEdge(ant);
                updateLastSelectedEdge();
                if ant.hasAllColors then
                    ant.isDone ← true;
                    ++antsDoneTour;
                end if
            end for
        end while
        tempAnt ← findBestAnt();
        updatePheromoneOnBestPath(tempAnt.path);
        if tempAnt.cost < bestAnt.cost then
            bestAnt ← tempAnt;
        end if
    end for
    return bestAnt.cost;
end procedure


Figure 3: ACO-ACSP: Heuristic based on ant colony optimization.


In this section, we develop GA-ACSP which is another 
heuristic solution to ACSP based on GA. The algorithm 
is presented in Figure]^ 

During the initialization step, a pool of chromosomes,
called the population, is generated. We encode the
chromosomes in such a way that each chromosome is represented
by an ordered list of vertices, corresponding to a solution
to a given ACSP instance. As the path is not necessarily
a simple path, each vertex may appear multiple times
on a chromosome. Initially, the population is filled with a
certain number of randomly created, possibly non-simple,
paths, each of which visits all the distinct colors at least
once.
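
The paper does not state how the random initial paths are produced. A minimal sketch, assuming a plain random walk from the designated start vertex that stops as soon as every color has been seen, is given below. The names `random_chromosome`, `adj`, and `color` are hypothetical and used only for illustration.

```python
import random

def random_chromosome(adj, color, start, rng=random):
    """Random walk from `start` until every color has been seen.

    adj:   dict mapping vertex -> list of neighbouring vertices
    color: dict mapping vertex -> color
    Assumes a connected graph, so the walk ends with probability 1.
    """
    all_colors = set(color.values())
    path = [start]
    seen = {color[start]}
    while seen != all_colors:
        nxt = rng.choice(adj[path[-1]])   # vertices may repeat: the path is non-simple
        path.append(nxt)
        seen.add(color[nxt])
    return path

# Tiny example: the path graph 0 - 1 - 2 with three distinct colors.
adj = {0: [1], 1: [0, 2], 2: [1]}
color = {0: "red", 1: "green", 2: "blue"}
chrom = random_chromosome(adj, color, start=0, rng=random.Random(1))
```

Because the walk only ever moves along existing edges, the resulting chromosome is connected by construction and covers every color.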

In the next step, the fitness values are calculated for
each chromosome in the population. The fitness value of
a chromosome, in GA-ACSP, is simply taken as the total
distance traveled down the corresponding possibly
non-simple path.

procedure GA-ACSP
    population ← ∅;
    for i ← 0 to populationSize do
        chrm ← createRandomChromosome();
        ensureConnectivity(chrm);
        population.add(chrm);
    end for
    for i ← 0 to iterationCount do
        candidates ← rouletteWheelSelection(population);
        children ← crossOver(candidates);
        for all child in children do
            r ← random(0, 1);
            if r < mutationProbability then
                mutate(child);
            end if
            completeMissingColors(child);
            ensureConnectivity(child);
            population.add(child);
        end for
        w2c ← find2chromosomesWithLowestFitness();
        population.remove(w2c);
    end for
    return costOfBestChromosome;
end procedure

Figure 4: GA-ACSP: Heuristic based on genetic algorithm.

In the selection step, two candidate chromosomes are 
selected from the population for crossover. The selection
of the candidates is performed using the roulette wheel
selection algorithm [^, in which the chromosomes with 
higher fitness values have higher chances to be selected. 
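
Roulette wheel selection can be sketched as follows. Since the quantity measured for a chromosome is a path length, this sketch assumes that each slice of the wheel is proportional to the inverse of the path cost, so that shorter paths are more likely to be drawn. The exact weighting used in GA-ACSP is not given in the text, and `roulette_wheel_select` is a hypothetical name.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel_select(costs, k=2, rng=random):
    """Pick k distinct indices; slice size is proportional to 1 / cost.

    Assumes all costs are strictly positive.
    """
    chosen = []
    candidates = list(range(len(costs)))
    for _ in range(k):
        weights = [1.0 / costs[i] for i in candidates]
        cumulative = list(accumulate(weights))
        spin = rng.random() * cumulative[-1]           # where the wheel stops
        pos = min(bisect_right(cumulative, spin), len(candidates) - 1)
        chosen.append(candidates.pop(pos))             # never pick the same one twice
    return chosen

costs = [40.0, 55.0, 120.0, 38.0]    # hypothetical path lengths
parents = roulette_wheel_select(costs, rng=random.Random(7))
```

Removing a selected index before the second spin guarantees two different parents, which the crossover step needs.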

In performing a crossover, two random positions p1 and
p2 with a common vertex are initially located in the
first and the second candidate chromosomes, respectively.
The portions beyond p1 and p2 are then swapped between
the candidates to produce two new children. In case the
candidates do not have a common vertex, two more
candidates for crossover are selected until a vertex common
to both can be found. In the end, two new, possibly
non-simple, paths are generated. It should be noted, however,
that these new paths are not guaranteed to visit all the
colors. Therefore, as soon as the crossover and the
mutation steps are over, we examine the paths to find a list of
the missing colors on each, and then modify the paths
accordingly so that, when the process is over, each of the two
new chromosomes also represents a possibly non-simple path
that visits each color at least once. In modifying the paths,
we follow a greedy policy, and append to the tail of the
chromosome, at each iteration, the shortest path from the
tail to the closest vertex with a color not visited yet. This
helps keep the path lengths as short as possible.
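
The crossover and the greedy repair described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiments: the graph is assumed to be a dict of dicts holding positive edge weights, and the names `crossover` and `complete_missing_colors` are hypothetical.

```python
import heapq
import random

def crossover(a, b, rng=random):
    """Swap the tails of two paths after a randomly chosen common vertex."""
    common = set(a) & set(b)
    if not common:
        return None                       # caller should select other candidates
    v = rng.choice(sorted(common))
    p1 = rng.choice([i for i, x in enumerate(a) if x == v])
    p2 = rng.choice([i for i, x in enumerate(b) if x == v])
    return a[:p1 + 1] + b[p2 + 1:], b[:p2 + 1] + a[p1 + 1:]

def complete_missing_colors(path, graph, color):
    """Greedily append the shortest path to the nearest vertex of an unseen color."""
    path = list(path)
    missing = set(color.values()) - {color[v] for v in path}
    while missing:
        # Dijkstra from the tail; stop at the first settled vertex with a missing color.
        dist, prev, heap = {path[-1]: 0}, {}, [(0, path[-1])]
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue                  # stale heap entry
            if color[u] in missing:
                target = u
                break
            for w, weight in graph[u].items():
                if d + weight < dist.get(w, float("inf")):
                    dist[w], prev[w] = d + weight, u
                    heapq.heappush(heap, (d + weight, w))
        if target is None:
            raise ValueError("some color is unreachable from the tail")
        hop = []
        while target != path[-1]:         # walk back from the target to the tail
            hop.append(target)
            target = prev[target]
        path.extend(reversed(hop))
        missing -= {color[v] for v in hop}
    return path

graph = {0: {1: 2, 2: 5}, 1: {0: 2, 2: 1, 3: 4},
         2: {0: 5, 1: 1, 3: 1}, 3: {1: 4, 2: 1}}
color = {0: "r", 1: "g", 2: "b", 3: "y"}
child_a, child_b = crossover([0, 1, 2], [0, 2, 1, 3], rng=random.Random(3))
repaired = complete_missing_colors([0, 1], graph, color)
```

Because both parents are cut at the same vertex, each child is still a walk in the graph; only its color coverage may need the greedy repair.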

The last step in GA-ACSP is the mutation. It is carried
out by simply replacing two random vertices in the
chromosome, and rearranging the path to ensure that it
remains connected. In order to connect non-neighbor
vertices in the chromosome, the shortest path between those
vertices is inserted into the chromosome.
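
A possible reading of this step is sketched below. The text does not say what the two vertices are replaced with, so the sketch assumes that two random positions other than the start are overwritten with randomly drawn vertices, after which connectivity is restored by splicing in shortest paths. The names `shortest_path`, `ensure_connectivity`, and `mutate` are hypothetical, and the graph is assumed to be connected.

```python
import heapq
import random

def shortest_path(graph, src, dst):
    """Dijkstra on a dict-of-dicts graph; returns the vertex list src..dst."""
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue                      # stale heap entry
        for w, weight in graph[u].items():
            if d + weight < dist.get(w, float("inf")):
                dist[w], prev[w] = d + weight, u
                heapq.heappush(heap, (d + weight, w))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def ensure_connectivity(chrom, graph):
    """Insert shortest paths between consecutive vertices that are not neighbours."""
    fixed = [chrom[0]]
    for v in chrom[1:]:
        u = fixed[-1]
        if v == u:
            continue                      # drop an immediate repeat
        if v in graph[u]:
            fixed.append(v)
        else:
            fixed.extend(shortest_path(graph, u, v)[1:])
    return fixed

def mutate(chrom, graph, rng=random):
    """Overwrite two random non-start positions with random vertices, then repair."""
    mutated = list(chrom)
    for _ in range(2):
        if len(mutated) > 1:
            mutated[rng.randrange(1, len(mutated))] = rng.choice(sorted(graph))
    return ensure_connectivity(mutated, graph)

# Line graph 0 - 1 - 2 - 3 with unit weights.
graph = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
repaired = ensure_connectivity([0, 3, 1], graph)
mutant = mutate([0, 1, 2, 3], graph, rng=random.Random(5))
```

The start vertex is never overwritten in this sketch, since every ACSP path must begin at the designated vertex.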

After the crossover and the mutation steps, the child
chromosomes need to be verified to correctly represent a
possibly non-simple path. After any possible problems
regarding missing colors and disconnectivity are dealt with,
as described above, there is still some more work to do.
First, the redundant segment at the tail part of the
chromosome should be cropped if all the colors have already
been seen before the beginning of that segment. Lastly,
the property that no edge can be visited more than once
in any direction in an optimal solution can be violated as
a result of the crossover and the mutation operations. In
such a case, the property should be restored, as highlighted
in the proof of Proposition 4.1.
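
The tail cropping can be illustrated with a short sketch. Only the cropping is shown here; restoring the edge property of Proposition 4.1 is not covered. The function name `crop_redundant_tail` is hypothetical.

```python
def crop_redundant_tail(path, color):
    """Cut the path right after the vertex at which the last new color is first seen."""
    all_colors = set(color.values())
    seen = set()
    for i, v in enumerate(path):
        seen.add(color[v])
        if seen == all_colors:
            return path[:i + 1]
    return path          # a color is still missing, so there is nothing to crop

color = {0: "r", 1: "g", 2: "b"}
cropped = crop_redundant_tail([0, 1, 2, 1, 0], color)
```

In this example every color has been collected once vertex 2 is reached, so the trailing segment back through vertices 1 and 0 only adds cost and is removed.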

Once the two newly formed child chromosomes are 
added to the population, the two chromosomes that have 
the worst fitness values are removed from the population. 


6. Experiments 

In this section, we present the results of our
experiments for the proposed heuristic algorithms. We refer to
LPxACSP, LPfACSP, and LPf/xACSP under the heading
of LP relaxation based algorithms, whereas SA-ACSP,
ACO-ACSP, and GA-ACSP are treated under the category
of metaheuristic algorithms. We implemented the
metaheuristic algorithms in C++, and used CPLEX for the
ILP and the LP relaxation based heuristics. All tests are
performed on computers that have an AMD Phenom(tm) II X4
810 2.67 GHz CPU and 2 GB of 400 MHz DDR2 RAM,
running the 32-bit operating system Ubuntu 10.04. We
conducted the experiments on randomly generated graphs
with varying numbers of nodes and colors, as listed in Table 1,
with an average node degree of 6, uniform color
distribution, and an average edge weight of 10. The simulations
are conducted 10 times for each graph type, and only the
average and the minimum cost values are reported.


In the sections to follow, the experimental results are
reported, first, separately for each metaheuristic algorithm.
In each of these sections, we conduct various experiments
for parameter tuning, namely, to find the optimal values of
individual parameters specific to a metaheuristic. Finally,
in Section 6.4, an overall comparison is presented to assess
the relative performance of all the heuristics proposed,
using the fine-tuned parameters.

Graph Name    Number of Nodes    Number of Colors
n50-c10       50                 10
n50-c20       50                 20
n50-c25       50                 25
n100-c25      100                25
n100-c40      100                40
n100-c50      100                50
n200-c50      200                50
n200-c75      200                75

Table 1: The number of nodes, and colors for the randomly generated
graph types used in the experiments.

6.1. SA-ACSP: Parameter Tuning for SA

In SA, the two parameters essential to performance are
the cooling rate and the temperature. We first look for the
cooling rate that gives the best trade-off between time and
cost, and then for the best temperature to be used with
that cooling rate. All the other parameters for SA-ACSP
are kept unchanged during these tests.

The cooling rate is used to determine the amount of
decrease in the temperature value at each iteration. We
tested SA-ACSP with various cooling rate values as
presented in Figures 5, 6, and 7. The temperature value for
these tests is set at 100. Looking at Figures 5 and 6,
we can observe that the cost slightly decreases with the
increasing values of the cooling rate parameter. In Figure 7,
however, we also observe that the time increases
dramatically for the cooling rate values larger than 0.999.
Therefore, based on the results of these experiments, we selected
the best cooling rate parameter to be 0.999.

Figure 5: Minimum cost for various cooling rate values in SA-ACSP.

The temperature value in SA-ACSP controls the
probability of choosing worse paths in order to not get stuck
at a local minimum. We conducted simulations with
various temperature values on all graph types, and present the
cost and the CPU times in Figures 8, 9, and 10. Based
on the results in these figures, we selected 1000 as the best
temperature value.
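
The role of the two SA parameters can be illustrated with a small sketch. It assumes the standard Metropolis acceptance rule and a geometric cooling schedule in which the temperature is multiplied by the cooling rate at each iteration. The exact rules used in SA-ACSP are not restated here, and the stopping temperature of 1 is an assumption made only for the illustration.

```python
import math

def acceptance_probability(delta, temperature):
    """Metropolis rule: always accept improvements, sometimes accept worse paths."""
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)

def iterations_to_cool(t_start, t_end, cooling_rate):
    """Multiplications by cooling_rate needed to fall from t_start to t_end."""
    return math.ceil(math.log(t_end / t_start) / math.log(cooling_rate))

# A path that is 50 units longer is accepted far more readily at T = 1000 than at T = 10.
p_hot = acceptance_probability(50, 1000)
p_cold = acceptance_probability(50, 10)

# Cooling from 1000 down to 1 at different cooling rates.
steps = {r: iterations_to_cool(1000, 1, r) for r in (0.99, 0.999, 0.9999)}
```

Under these assumptions, each additional 9 in the cooling rate multiplies the number of iterations by roughly ten, which is in line with the sharp increase in CPU time observed for cooling rates above 0.999.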

Figure 6: Average cost for various cooling rate values in SA-ACSP.

Figure 7: CPU time for calculating the average cost values for various
cooling rate values in SA-ACSP.

Figure 8: Minimum cost for various temperature values in SA-ACSP.

Figure 9: Average cost for various temperature values in SA-ACSP.

6.2. ACO-ACSP: Parameter Tuning for ACO

The behavior of ACO-ACSP depends on four separate
parameters. In order to find the optimal value for each
parameter, we conducted a series of experiments for each
individual parameter. In each experiment, all the other
parameters are kept constant, only the parameter under
test is varied, and the observed cost and runtime values
are recorded. The experiments conducted can be
categorized into alpha-beta tests, colony size tests, and
probability tests.

Alpha and beta values are the two parameters used
for edge selection in ACO-ACSP. The parameter alpha
denotes the importance of the pheromone levels on the
edges while calculating probabilities. Beta, on the other
hand, represents the importance of the edge weights. The
experiments are designed to decide on the combination of
alpha and beta values that gives us the best result in
terms of the cost and the CPU time. The results of the
experiments are presented in Figures 11, 12, and 13. These
figures report the results with respect to the CPU time,
the average cost, and the minimum cost, and based on the
results, we selected the alpha value as 0.4, and the beta value
as 0.5.
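
The way alpha and beta weigh the two sources of information can be sketched with the usual ant colony transition rule, in which the probability of an edge is proportional to its pheromone level raised to alpha times the inverse of its weight raised to beta. This rule is assumed here as the standard ACO form and is not taken from the ACO-ACSP implementation. The fallback to distances mirrors the `sum = 0` branch in Figure 3, and `edge_probabilities` is a hypothetical name.

```python
def edge_probabilities(edges, alpha=0.4, beta=0.5):
    """edges: list of (pheromone, weight) pairs for the edges available to an ant.

    Returns selection probabilities proportional to
    pheromone**alpha * (1 / weight)**beta.
    """
    scores = [(tau ** alpha) * ((1.0 / w) ** beta) for tau, w in edges]
    total = sum(scores)
    if total == 0:
        # No pheromone on any available edge yet: use distances only.
        scores = [(1.0 / w) ** beta for _, w in edges]
        total = sum(scores)
    return [s / total for s in scores]

# Three candidate edges given as (pheromone level, edge weight).
probs = edge_probabilities([(1.0, 10.0), (4.0, 10.0), (1.0, 40.0)])
fallback = edge_probabilities([(0.0, 1.0), (0.0, 4.0)])
```

With equal weights, the edge carrying more pheromone is preferred, and with equal pheromone, the lighter edge is preferred. Raising alpha or beta sharpens the corresponding preference.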

In ACO-ACSP, the colony size represents the number
of active ants deployed at each iteration of the algorithm.
Having a larger colony increases the chances of finding
solutions closer to the optimal, however, at the cost of
increasing the overall runtime. Therefore, we test
ACO-ACSP for various colony sizes to decide on the optimal
colony size that can achieve a minimal cost solution in an
acceptable time period. The results of the experiments are
presented in Figures 14, 15, and 16. As we can clearly see
from the figures, the runtime increases linearly in the size
of the colony, and the cost decreases only slightly for colony
sizes larger than 200. Based on these results, we selected
the colony size as 200 in the rest of the experiments.


The edge selection probabilities in ACO-ACSP are
calculated based on either the level of pheromones on the
edge or the edge weight itself. The results of the
experiments conducted to find the optimal probability value are
presented in Figures 17, 18, and 19. A close inspection of
these figures reveals that both the cost and the runtime
increase for probability values larger than 0.95. Based
on this observation, therefore, we selected the probability
value as 0.9 to be used throughout the rest of the
experiments.

Figure 10: CPU time for calculating average cost for various
temperature values in SA-ACSP.

Figure 11: Minimum cost for various combinations of alpha, and
beta values in ACO-ACSP.

Figure 12: Average cost for various combinations of alpha, and beta
values in ACO-ACSP.

Figure 13: CPU time for calculating the average cost for various
combinations of alpha, and beta values in ACO-ACSP.

Figure 14: Minimum cost for various colony size values in ACO-ACSP.

Figure 15: Average cost for various colony size values in ACO-ACSP.

Figure 16: CPU time for calculating the average cost for various
colony sizes in ACO-ACSP.

Figure 17: Minimum cost for various edge selection probability values
in ACO-ACSP.

Figure 18: Average cost for various edge selection probability values
in ACO-ACSP.

Figure 19: CPU time for calculating the average cost for various edge
selection probability values in ACO-ACSP.

6.3. GA-ACSP: Parameter Tuning for GA

In GA-ACSP, the parameters investigated are the
mutation probability, the population size, and the iteration
size. The results of the tests on the mutation probability
are presented in Figures 20, 21, and 22. It can be seen
from the figures that the changes in the mutation
probability do not affect the cost dramatically. Therefore, to
prevent over-randomization of chromosomes, we selected
for the mutation probability a value as low as 0.1.

Figure 20: Minimum cost for various mutation probability values in
GA-ACSP.

We also conducted experiments to find the optimal
iteration size. The results of the experiments are presented in
Figures 23, 24, and 25. Based on the results, we decided
to select the iteration size as 6000, as it provides the best
trade-off between the cost and the runtime.

We also experimented on various population sizes to find
the best population size for GA-ACSP. The results of the
experiments are presented in Figures 26, 27, and 28. Based
on the results, the population size is selected as 600, as both
the running time and the cost at this value of population
size are lower than they are at larger population sizes.

Figure 21: Average cost for various mutation probability values in
GA-ACSP.

Figure 22: CPU time for calculating the average cost for various
mutation probability values in GA-ACSP.

Figure 23: Minimum cost for various iteration sizes in GA-ACSP.

Figure 24: Average cost for various iteration sizes in GA-ACSP.

Figure 25: CPU time for calculating the average cost for various
iteration sizes in GA-ACSP.

Figure 26: Minimum cost for various population sizes in GA-ACSP.

Figure 27: Average cost for various population sizes in GA-ACSP.

Figure 28: CPU time for calculating the average cost for various
population sizes in GA-ACSP.

6.4. Comparing the Heuristic Algorithms

In this section, we first present the performance of
the LP relaxation based heuristics, namely LPxACSP,
LPfACSP, and LPf/xACSP, in Figure 29. The results are
reported in proportion to the optimal values obtained via
the ILP formulation. Next, we compare the performance
of the metaheuristic algorithms, SA-ACSP, ACO-ACSP,
and GA-ACSP. The results for them are presented, again
in proportion to the optimal values, in Figure 30. For each
individual metaheuristic algorithm, in these tests, the best
parameter values discovered are used. We use randomly
generated graphs for the types presented in Table 1.

Based on the experimental results, it is observed that
the total path length returned by SA-ACSP is better than
that returned by ACO-ACSP for medium-sized graphs.
In contrast, ACO-ACSP finds lower cost paths compared
to SA-ACSP for larger graphs. The performance of
GA-ACSP, in terms of solution quality, is similar to the other
two metaheuristics. It has, however, a remarkable
advantage in terms of time spent over the other two on all types
of graphs.


7. Conclusion 

In this paper, a novel and generic problem, the All Colors
Shortest Path (ACSP) problem, has been formulated
and computationally explored. ACSP has been shown to
be NP-hard, and also inapproximable within a constant
factor of the optimal. An ILP formulation has been
developed for ACSP. Various heuristic solutions have then been
devised, based on iterative rounding applied to an LP
relaxation of the ILP formulation. Moreover, three different
metaheuristic solutions based on simulated annealing, ant
colony optimization, and genetic algorithm have been
proposed. Through extensive simulations, an experimental
evaluation of all the heuristics has also been reported.

The study of the computational characteristics of ACSP
when the underlying graph is restricted to be a tree is
left as future work. Investigation of an approximation bound is
left as an interesting open problem.

Figure 29: Comparison of ILP, LPxACSP, LPfACSP, and LPf/xACSP.

Figure 30: Comparison of SA-ACSP, ACO-ACSP, and GA-ACSP with their best parameters.